[00:01.240 --> 00:07.640]  Okay, hi everyone. This is Zoltan and Hyrum's...
[00:09.800 --> 00:17.000]  in conjunction with Microsoft and CUJO AI. They're launching a very interesting security competition
[00:17.680 --> 00:21.180]  which kind of relates back to some of the stuff we were talking about in opening remarks
[00:21.800 --> 00:54.640]  and I'll let them take it away. Sorry about the audio.
[00:59.930 --> 01:06.650]  I will get the audio working. I am sorry. Why is it not capturing the right thing?
[01:22.320 --> 01:27.920]  I'm Balazs and this is a project together with Hyrum Anderson.
[01:29.220 --> 01:36.380]  We are going to talk about machine learning detection bypasses. In the past you have probably
[01:36.380 --> 01:46.040]  seen research where people modified existing images with the goal of bypassing
[01:46.040 --> 01:54.480]  machine learning classifiers. So for example, for a human viewer the new image looks like the
[01:54.480 --> 02:02.240]  original one, but to a machine learning classifier it looks like something different. Like it will
[02:02.240 --> 02:11.440]  believe that this looks like an ostrich. In this talk we are going to present machine learning
[02:11.440 --> 02:18.620]  bypasses when it comes to malicious software, and there has been some interesting research in
[02:18.620 --> 02:28.040]  the past regarding this topic as well. For example, last year there was new research published
[02:28.480 --> 02:36.580]  where people extracted strings from a known game executable, appended them to known malware,
[02:36.580 --> 02:45.000]  and it was able to bypass a production machine learning model. I also did some research in
[02:45.000 --> 02:51.120]  the past, and about four years ago you were able to bypass some machine learning models
[02:51.120 --> 02:55.520]  just by packing a sample with UPX.
[02:57.020 --> 03:03.800]  In order to advance the field of offensive and defensive machine learning based malware
[03:03.800 --> 03:12.000]  detection, last year we created a challenge where you had to download 50 working malware samples,
[03:12.000 --> 03:16.260]  you had to download three machine learning models with their weights,
[03:16.260 --> 03:20.520]  modify the malware samples to evade detection by all models
[03:20.520 --> 03:29.460]  and if you were lucky and you had the most points you were able to win this nice GPU card.
[03:30.180 --> 03:36.480]  In total 70 people registered for this competition and at least 11 people were
[03:36.480 --> 03:43.860]  able to bypass at least one machine learning model. Congratulations to the winner William
[03:43.860 --> 03:52.920]  Fleshman, and I highly recommend you check out his blog post on how he won this competition.
[03:54.100 --> 04:01.640]  There were some other write-ups and papers as well. I do recommend you to check out those as
[04:01.640 --> 04:08.140]  well. You can see one from Jakub and one from Fabrizio on the following links.
[04:09.100 --> 04:16.740]  When it came to winning this competition, multiple approaches were used. Some people started
[04:16.740 --> 04:26.840]  with a simple packer like the one I mentioned, but unfortunately some of the samples are already
[04:27.400 --> 04:41.080]  packed in a way and this means that if you use UPX or something similar it will not work anymore.
[04:43.200 --> 04:51.320]  Another great approach was to add new sections to the executable. For example, you can extract
[04:51.900 --> 04:58.820]  the end user license agreement resource from Microsoft Files and add it to the malware
[04:58.820 --> 05:07.920]  samples multiple times. This approach was really good at bypassing the detection for the ML models
[05:07.920 --> 05:13.740]  but unfortunately again this broke some of the malware binaries.
[05:14.980 --> 05:24.040]  Fun fact, if you just simply add sections to a malware sample, you might be able to bypass
[05:24.040 --> 05:33.400]  some antivirus detection because for performance reasons some AV engines check the number of
[05:33.400 --> 05:43.720]  sections before they evaluate the rest of the rules. In the end, the winning strategy was
[05:43.740 --> 05:51.690]  to just append random data to the end of the executable. This is called an overlay.
[05:52.280 --> 05:58.910]  Even though this is a very simple strategy, it worked during last year's competition.
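
The overlay idea described above can be sketched in a few lines of Python. This is a minimal illustration, not the winner's code: the function name `append_overlay` and the default byte count are assumptions made for this example. Appended bytes sit past the end of the original file contents, but a modified sample should still be checked in a sandbox to confirm it behaves the same.

```python
import os


def append_overlay(src_path: str, dst_path: str, n_bytes: int = 4096) -> int:
    """Copy src_path to dst_path and append n_bytes of random data as an overlay.

    Returns the size of the new file in bytes.
    """
    with open(src_path, "rb") as f:
        data = f.read()
    with open(dst_path, "wb") as f:
        f.write(data)                 # original file contents, unchanged
        f.write(os.urandom(n_bytes))  # overlay: extra bytes after the original end
    return len(data) + n_bytes
```

Writing to a separate destination path keeps the original sample intact, so several overlay sizes can be tried from the same starting file.
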
[06:02.750 --> 06:12.800]  This is also an easy way to bypass if the sample has any kind of self-protection, for example.
[06:13.740 --> 06:19.240]  Just by increasing the size of the sample, again, you might be able to bypass some
[06:19.240 --> 06:28.000]  antivirus engines. Again, they can have a file size in their rules.
[06:31.220 --> 06:41.200]  And one important thing, as you can see on the top right image, that's a visual representation
[06:41.200 --> 06:51.780]  of a malware. And I just appended some random strings to the end of this sample. And if you
[06:51.780 --> 07:02.820]  look at the green visual, you can clearly see how this changed the visual representation of the
[07:02.820 --> 07:12.180]  sample. For us, there were some key takeaways from last year's competition. For example,
[07:12.180 --> 07:17.820]  some of the machine learning models are way too academic, but not very effective in practice.
[07:18.740 --> 07:27.740]  It turned out it's not just us, but everybody thinks that the LIEF tool is awesome. This is a Python
[07:27.740 --> 07:36.560]  package you can use to modify binaries. And as is always the case with malware, it is tricky
[07:37.050 --> 07:41.940]  to deal with. For example, some of the samples do not reproduce the same
[07:42.820 --> 07:49.360]  indicators of compromise over time. This can be because, for example, the command and control
[07:49.360 --> 07:55.640]  server is down, and dealing with packed and protected samples can be hard sometimes.
[07:57.740 --> 08:06.260]  I also checked the ssdeep hashes of some of the samples, and it was interesting to see that
[08:06.780 --> 08:15.860]  whenever people added repeating patterns to the sample, for example, the same section, or they
[08:16.820 --> 08:23.220]  added the same overlay over and over again to the sample, then it created a repeating pattern
[08:23.220 --> 08:33.180]  in the ssdeep hash as well, which can be used for detecting a sample which uses
[08:33.180 --> 08:40.980]  machine learning evasion, for example. This year, we created the Defender and the
[08:40.980 --> 08:46.080]  Attacker challenge. In the Defender challenge, you had to create your own machine learning model
[08:46.080 --> 08:52.500]  and submit this to the competition in a Docker format, and in the Attacker challenge,
[08:52.500 --> 08:59.680]  now the machine learning models are not available for you, so this is now a black box challenge.
[09:00.440 --> 09:06.120]  And if you win this competition, either the Defender or the Attacker challenge,
[09:06.120 --> 09:12.160]  you can win some Azure credits for your machine learning research plans.
[09:12.830 --> 09:18.860]  The defensive track is already over. We received two submissions that passed the minimum
[09:18.860 --> 09:27.500]  requirements, and the offensive track has already started, so I highly recommend you go to our
[09:27.500 --> 09:38.100]  website, mlsec.io, and check out what the competition is. This year, we have used the
[09:38.400 --> 09:46.920]  malware families, and if you go to our website, review the terms of service, and then you can
[09:46.920 --> 09:55.200]  download the 50 provided malware samples, and after that, it is your time to modify the samples
[09:55.200 --> 10:06.340]  in order to evade the detection. And new to this year, you can use an API to check your samples or
[10:06.340 --> 10:13.600]  submit your samples. I also recommend that you verify that the malware functionality remains
[10:13.600 --> 10:22.520]  the same on your local Windows box. Then, when you upload the zip files, or you can just upload the
[10:22.520 --> 10:31.440]  partial zip files, meaning that you only submit some of the samples and not all of them,
[10:31.440 --> 10:39.940]  you can receive one point for each bypassed machine learning model, which means that
[10:39.940 --> 10:48.760]  for every sample, you can get up to three points. And as usual, highest score wins.
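
The scoring rule just described (one point per bypassed model, up to three per sample) works out as a simple sum. The sketch below is an illustration only; the `results` layout and the sample names are hypothetical, not the competition's data format.

```python
from typing import Dict, List


def total_score(results: Dict[str, List[bool]]) -> int:
    """Sum one point for every (sample, model) pair where the model was bypassed.

    `results` maps a sample name to one boolean per model (True = bypassed).
    Samples left out of a partial zip simply contribute nothing.
    """
    return sum(sum(bypassed) for bypassed in results.values())


# Hypothetical partial submission: two of the provided samples, three models each.
example = {
    "001": [True, True, False],
    "002": [False, False, False],
}
print(total_score(example))
```
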
[10:48.880 --> 10:56.940]  The details about this will be provided by Hyrum, and in order to claim your prize, you have to
[10:56.940 --> 11:06.520]  publish your solution. Please note that you have to keep the file names as they were in the
[11:06.520 --> 11:16.520]  original zip file. This helps us to track which file you modified originally. We also provide some
[11:16.520 --> 11:23.460]  additional tips and tricks you might use in this competition. Some of them may not make sense, but
[11:24.100 --> 11:30.440]  you can modify an executable in a lot of different ways. For example, you can add or
[11:30.440 --> 11:37.660]  remove signatures, change section names or properties, modify the import or export tables, create TLS
[11:37.660 --> 11:44.620]  callbacks, change the PE header, fix or change the checksums, add, modify, or remove the version
[11:44.620 --> 11:51.760]  information, create new entry points, or just change some code or data in it.
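
As one concrete instance of a header change from the list above, the sketch below rewrites the COFF `TimeDateStamp` field of a PE file using only the Python standard library. The function name `set_pe_timestamp` is an assumption for this example, and nothing here claims that this particular field influences any of the competition models; the field offsets follow the published PE/COFF layout.

```python
import struct


def set_pe_timestamp(data: bytes, timestamp: int) -> bytes:
    """Return a copy of a PE file's bytes with the COFF TimeDateStamp replaced."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew, the offset of the PE signature, is a little-endian uint32 at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("PE signature not found")
    # The COFF header follows the 4-byte signature:
    # Machine (2 bytes), NumberOfSections (2 bytes), TimeDateStamp (4 bytes).
    out = bytearray(data)
    struct.pack_into("<I", out, pe_offset + 8, timestamp)
    return bytes(out)
```

A dedicated PE library would normally handle edits like adding sections or rebuilding import tables; this sketch only covers a single fixed-offset field.
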
[11:53.040 --> 12:01.340]  Still, it's not allowed to create droppers or self-extracting archives, because this will kind
[12:01.340 --> 12:08.660]  of defeat the purpose of the whole competition. And this year, keep in mind that multiple
[12:08.660 --> 12:13.770]  registration is against the rules, and it will result in immediate disqualification.
[12:14.190 --> 12:20.550]  Please do join our Slack channel, where you can discuss everything with us,
[12:20.550 --> 12:29.570]  and you can also discuss your progress with the other participants of this competition.
[12:31.970 --> 12:40.250]  Just a side note, the whole frontend was created in Python with Flask-Admin, and
[12:40.250 --> 12:46.550]  we are using Cloudflare, Nginx, and Gunicorn for scalability and performance reasons.
[12:46.610 --> 12:55.390]  There are some backend scripts running with Python, scheduled by cron, and as was
[12:55.390 --> 13:03.490]  the case last year, we still use the VMRay sandbox to evaluate the samples. As mentioned,
[13:03.490 --> 13:12.360]  we already have an API, so if you want to check your sample
[13:12.690 --> 13:17.650]  against all the machine learning models, or just against one machine learning model,
[13:17.650 --> 13:25.930]  you can use the API to do just that. And also, you can use the API to get the results.
[13:25.930 --> 13:32.530]  And if you are satisfied with bypassing the machine learning models, you can upload your
[13:32.530 --> 13:40.610]  zip files and query the zip status and the sample statuses as well with the API keys.
[13:40.690 --> 13:46.570]  This is all I wanted to share with you guys, but please welcome Hyrum, who will
[13:47.490 --> 13:54.630]  present some other tips and tricks you can use to win this competition. Thank you.
[13:56.640 --> 14:02.940]  I'm going to describe to you the example solution in the machine learning security evasion
[14:02.940 --> 14:10.500]  competition's attacker challenge that has just begun. The models that you'll be attacking this
[14:10.500 --> 14:16.780]  year have been submitted by participants of the previous round in the defender challenge.
[14:16.940 --> 14:22.960]  Two of the models from the previous round have qualified to be included in this round.
[14:22.960 --> 14:29.740]  In addition, we have hosted our own model for you to attack. That model is trained on the
[14:29.740 --> 14:37.200]  Ember dataset and includes some basic capability to detect adversarial examples. The source code
[14:37.200 --> 14:44.220]  and model weights for this defended Ember model are provided on the competition's GitHub site.
[14:44.880 --> 14:52.940]  However, the remaining models are to you complete black boxes where you only get to observe the hard
[14:52.940 --> 15:00.040]  label predictions. That is a zero or a one for an input that you provide to the machine learning
[15:00.040 --> 15:07.680]  models. The final leaderboard ranking will be set by the following rank ordered criteria.
[15:07.980 --> 15:14.620]  First, the total number of evasions with one point for each of the three ML models
[15:14.620 --> 15:22.060]  times 50 malware samples, meaning that the maximum score is 150. Remember though that each
[15:22.060 --> 15:28.640]  evasive sample must reproduce its original functionality in a sandbox in order to be
[15:28.640 --> 15:37.020]  awarded a point. Functionality is verified only when you upload a zip file containing
[15:37.020 --> 15:44.760]  your candidate malware samples. It will not be verified when you merely query the machine
[15:44.760 --> 15:52.840]  learning models through the API. In the event of a tie for point number one, contestants will
[15:52.840 --> 16:00.820]  be ranked by the number of model queries used through the API. And lastly, the timestamp of
[16:00.820 --> 16:07.180]  your final zip upload would break any subsequent tie. More than likely though, we won't get to
[16:07.180 --> 16:13.800]  point number three, so you should feel incentivized to continue competing right up until the competition
[16:13.800 --> 16:19.540]  deadline, even if you see a perfect score on the leaderboard, because you might achieve that same
[16:19.540 --> 16:26.200]  perfect score but do it more efficiently. So as a contestant, you can choose any strategy you'd
[16:26.200 --> 16:33.740]  like to compete. But to demonstrate one possible strategy, we have released some example code on
[16:33.740 --> 16:39.940]  the competition's GitHub site. You can find more information about the nitty-gritty details of this
[16:39.940 --> 16:47.000]  approach on the website. It essentially is using a discrete optimization technique over a space of
[16:47.000 --> 16:52.880]  functionality preserving file modifications. However, the general strategy might be more useful
[16:53.580 --> 17:00.400]  for you to adopt. The strategy is really simple: it consists of doing a bunch of bulk
[17:00.400 --> 17:06.760]  work using an algorithm in part A and then kind of batting cleanup for manual manipulation of
[17:06.760 --> 17:12.880]  malware samples in part B. And I'm going to be describing and demoing the code for part A today.
[17:13.100 --> 17:19.300]  In it, because we'd like to be efficient in the number of queries against the hosted machine
[17:19.300 --> 17:26.400]  learning models, we'll actually break this attack into two parts. An offline attack where we use the
[17:26.400 --> 17:33.020]  defended Ember model for which we have code to kind of work out our strategy and generate initial
[17:33.020 --> 17:41.020]  malware samples. We hope that those seeds might evade some of the online models that are hosted.
[17:41.520 --> 17:46.400]  And then in the online attack, we'll take those initial seeds and the algorithm will further
[17:46.400 --> 17:54.760]  optimize and discover additional file modifications required to evade the online hosted models.
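
The search loop behind this two-stage attack can be sketched as follows. To be clear about what is swapped in: the talk describes a discrete optimization technique, whereas this sketch uses plain random search for brevity. The `modifications` list and the `score` callable are hypothetical stand-ins; offline, `score` would wrap the local model, and online it would wrap queries to the hosted models.

```python
import random
from typing import Callable, List, Optional, Sequence, Tuple

Modification = Callable[[bytes], bytes]


def random_search(
    sample: bytes,
    modifications: Sequence[Modification],
    score: Callable[[bytes], float],
    threshold: float = 0.5,
    max_evals: int = 50,
    max_steps: int = 3,
    seed: int = 0,
) -> Tuple[Optional[bytes], List[int], int]:
    """Try random sequences of modifications until score(candidate) < threshold.

    Returns (evasive bytes or None, indices of the modifications applied,
    number of score evaluations used).
    """
    rng = random.Random(seed)
    for evals in range(1, max_evals + 1):
        # Pick between 1 and max_steps modifications, in random order.
        steps = rng.randint(1, max_steps)
        history = [rng.randrange(len(modifications)) for _ in range(steps)]
        candidate = sample
        for index in history:
            candidate = modifications[index](candidate)
        if score(candidate) < threshold:
            return candidate, history, evals
    return None, [], max_evals
```

Counting evaluations matters because every online evaluation is a query against the hosted models, so the cheap local score is the natural place to spend most of the budget.
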
[17:54.760 --> 18:01.060]  Some tricks that we're using here include label smoothing, where we're converting the hard label
[18:01.060 --> 18:09.940]  outputs into a soft score by averaging four things, the three hard label outputs from the
[18:09.940 --> 18:18.800]  hosted machine learning models, as well as a local score from a local machine learning Ember model
[18:18.800 --> 18:24.480]  that will be used as a heuristic to kind of guide the optimization process.
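
The averaging described here is small enough to write out. This is a sketch under the assumption that the local model returns a score between 0 and 1; the function name `soft_score` is made up for this example.

```python
from typing import Sequence


def soft_score(hard_labels: Sequence[int], local_score: float) -> float:
    """Average the hosted models' 0/1 labels together with a local model score.

    With three hosted models this averages four values, giving the optimizer
    a graded signal instead of only zeros and ones.
    """
    values = [float(label) for label in hard_labels] + [float(local_score)]
    return sum(values) / len(values)


# One hosted model still detects the sample; the local model is undecided.
print(soft_score([1, 0, 0], 0.5))  # (1 + 0 + 0 + 0.5) / 4 = 0.375
```

Because the local score varies continuously, two candidates with identical hard labels can still be ranked against each other.
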
[18:25.520 --> 18:31.920]  So as I demo this code, I want you to please be aware that this code writes malware to disk. So
[18:31.920 --> 18:43.280]  please do run this code only using a Linux VM. To begin, we initialize the attack by analyzing a
[18:43.280 --> 18:51.260]  collection of benign files. This init subcommand extracts elements of these benign files
[18:51.260 --> 18:58.420]  that will be later injected into the malware. To launch our offline attack, we'll run a local
[18:58.420 --> 19:06.080]  copy of the Ember model in the top window. Then in the bottom window, we'll use the run command,
[19:06.080 --> 19:11.800]  passing in malware samples that we downloaded after registering on the website.
[19:11.980 --> 19:19.620]  The tool will then write successful evasion attempts to the pass1/success folder that we've
[19:19.620 --> 19:25.840]  specified on the command line, and failed attempts to the pass1/failure folder.
[19:26.020 --> 19:34.120]  And also included in each output directory will be the history of file modifications
[19:34.830 --> 19:42.130]  that will be useful if we'd like to pick up and resume a failed attempt.
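
One way to picture a resumable modification history is a list of named steps stored next to each output file. The JSON layout, the function names, and the `registry` of named modifications below are all assumptions for illustration; the real tool's history format is not described in the talk.

```python
import json
from typing import Callable, Dict, List


def save_history(path: str, history: List[str]) -> None:
    """Write the ordered list of modification names as JSON."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump({"modifications": history}, f)


def load_history(path: str) -> List[str]:
    """Read back the ordered list of modification names."""
    with open(path, "r", encoding="utf-8") as f:
        return list(json.load(f)["modifications"])


def replay(
    sample: bytes,
    history: List[str],
    registry: Dict[str, Callable[[bytes], bytes]],
) -> bytes:
    """Re-apply a saved history to the original sample, in order."""
    for name in history:
        sample = registry[name](sample)
    return sample
```

Replaying from the original sample reproduces the failed candidate exactly, so a later pass can keep adding modifications on top of it.
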
[19:43.470 --> 19:48.140]  So to demonstrate that, in a second pass of the offline attack,
[19:48.140 --> 19:56.300]  we'll start with the pass one failures and iterate on the optimization approach,
[19:56.300 --> 20:00.830]  again storing successes and failures to a pass two directory.
[20:04.210 --> 20:09.100]  So after doing that a number of times and having collected a bunch of candidate samples offline
[20:09.670 --> 20:17.140]  that evade the local defended Ember model, we'll then use those candidates as seeds for
[20:17.140 --> 20:23.820]  an online attack, which now counts against our API query usage. To do an online attack,
[20:23.820 --> 20:31.820]  simply use this tool with the dash dash online flag, and the optimization will then continue
[20:32.720 --> 20:38.900]  trying to find file modifications that will bypass all three of the hosted models.
[20:41.200 --> 20:46.080]  Of course, you want to do perhaps as many iterations as necessary
[20:46.080 --> 20:51.560]  in the online version of this attack. But after you've done so, in a final pass of the online
[20:51.560 --> 21:00.540]  attack, you can now collect the successful samples into a zip file that you would then validate in a
[21:00.540 --> 21:09.000]  Windows 10 virtual machine, and then upload to the website for validation and leaderboard scoring.
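
Collecting the successful samples into a zip file can be done with the standard library. This sketch assumes each file in the success folder still carries its original file name and that a flat archive layout is acceptable; the required layout is whatever the competition rules specify, and `collect_samples` is a name invented for this example.

```python
import os
import zipfile
from typing import List


def collect_samples(success_dir: str, zip_path: str) -> List[str]:
    """Zip every regular file in success_dir, keeping each file name unchanged.

    Returns the sorted list of file names written to the archive.
    """
    names = sorted(
        name
        for name in os.listdir(success_dir)
        if os.path.isfile(os.path.join(success_dir, name))
    )
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for name in names:
            # arcname drops the directory prefix so only the file name is stored.
            archive.write(os.path.join(success_dir, name), arcname=name)
    return names
```

The zip should be written outside the success folder so that a rerun does not pick up the previous archive as a sample.
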
[21:09.000 --> 21:13.760]  I want to point out that since there is a chance that by running this code,
[21:13.760 --> 21:19.000]  file modifications might break some of the samples, you should always run these samples
[21:19.000 --> 21:25.620]  in a Windows 10 sandbox before uploading to the competition website. Also note that zip file
[21:25.620 --> 21:33.520]  uploads count against your API query count. So it is to your benefit to double check your work
[21:33.520 --> 21:39.120]  and make sure that any files you upload are functional, so you don't have to redo that work
[21:39.120 --> 21:46.420]  and upload again. As a final note, and this is kind of tricky: since the hosted models might actually be
[21:46.420 --> 21:53.100]  changing state and learning from the queries that you and others are giving them, there's a
[21:53.100 --> 22:00.700]  possibility that an evasive variant sample that you discovered along the way may no longer evade
[22:00.700 --> 22:05.360]  a model by the time that you upload your zip file. So... I don't know that that will be the case, but
[22:05.360 --> 22:11.300]  please be aware that that is a possibility. So with that, good luck on the competition.
[22:12.220 --> 22:17.920]  Visit the website at mlsec.io. The competition will run for over six weeks.
[22:18.540 --> 22:25.720]  And those who are ranked first and second on the leaderboard will win our
[22:25.720 --> 22:32.220]  grand and first prizes respectively, so long as they publish their solution.
[22:33.620 --> 22:36.560]  And with that, I'd like to thank our sponsors,
[22:36.560 --> 22:42.020]  Microsoft and CUJO AI, with partners MRG Effitas and VMRay.
